The suite is composed of various checks, such as Identifier Label Correlation, Is Single Value, and Feature Feature Correlation.
Each check may contain conditions (which result in pass ✓ / fail ✖ / warning ! / error ⁈) as well as other outputs such as plots or tables.
Suites, checks and conditions can all be modified. Read more about custom suites.
| Status | Check | Condition | More Info |
|---|---|---|---|
| ✖ | Feature Label Correlation | Features' Predictive Power Score is less than 0.8 | Found 2 out of 4 features with PPS above threshold: {'petal.width': '0.93', 'petal.length': '0.86'} |
| ✖ | Feature-Feature Correlation | Not more than 0 pairs are correlated above 0.9 | Correlation is greater than 0.9 for pairs [('petal.length', 'petal.width')] |
| ✓ | Data Duplicates | Duplicate data ratio is less or equal to 5% | Found 0.67% duplicate data |
| ✓ | Single Value in Column | Does not contain only a single value | Passed for 5 relevant columns |
| ✓ | Special Characters | Ratio of samples containing solely special character is less or equal to 0.1% | Passed for 5 relevant columns |
| ✓ | Mixed Nulls | Number of different null types is less or equal to 1 | Passed for 5 relevant columns |
| ✓ | Mixed Data Types | Rare data types in column are either more than 10% or less than 1% of the data | 5 columns passed: found 0 columns with negligible types mix, and 5 columns without any types mix |
| ✓ | String Mismatch | No string variants | Passed for 1 relevant column |
| ✓ | String Length Out Of Bounds | Ratio of string length outliers is less or equal to 0% | Passed for 1 relevant column |
| ✓ | Conflicting Labels | Ambiguous sample ratio is less or equal to 0% | Ratio of samples with conflicting labels: 0% |
Returns the Predictive Power Score (PPS) of each feature in relation to the label.

| Status | Condition | More Info |
|---|---|---|
| ✖ | Features' Predictive Power Score is less than 0.8 | Found 2 out of 4 features with PPS above threshold: {'petal.width': '0.93', 'petal.length': '0.86'} |
Checks for pairwise correlation between the features.

| Status | Condition | More Info |
|---|---|---|
| ✖ | Not more than 0 pairs are correlated above 0.9 | Correlation is greater than 0.9 for pairs [('petal.length', 'petal.width')] |
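The condition above can be illustrated with a minimal sketch. It uses Pearson correlation as a stand-in (the check itself may use a different correlation measure), and the helper names `pearson` and `correlated_pairs` as well as the toy column values are assumptions made up for illustration, not the library's API or the actual dataset.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return feature pairs whose absolute correlation exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) > threshold
    ]


# Toy values (hypothetical, only a handful of rows) shaped like the features above.
toy_columns = {
    "petal.length": [1.4, 1.3, 4.5, 4.7, 5.1, 6.0],
    "petal.width": [0.2, 0.2, 1.5, 1.4, 1.9, 2.5],
    "sepal.width": [3.5, 3.0, 3.2, 3.1, 2.7, 3.3],
}

pairs = correlated_pairs(toy_columns, threshold=0.9)
# Condition: "Not more than 0 pairs are correlated above 0.9"
condition_passed = len(pairs) <= 0
print(pairs, "pass" if condition_passed else "fail")
```

With these toy values only the petal pair crosses the threshold, so the condition fails in the same way as in the table above.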
Checks for duplicate samples in the dataset.

| Status | Condition | More Info |
|---|---|---|
| ✓ | Duplicate data ratio is less or equal to 5% | Found 0.67% duplicate data |
| Instances | Number of Duplicates | sepal.length | sepal.width | petal.length | petal.width | variety |
|---|---|---|---|---|---|---|
| 142, 101 | 2 | 5.80 | 2.70 | 5.10 | 1.90 | Virginica |
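The reported 0.67% is consistent with one redundant copy among 150 rows (1 / 150). The sketch below reproduces that arithmetic under two assumptions: that the dataset has 150 rows, and that the ratio counts redundant copies over the total row count. The function `duplicate_ratio` and the placeholder rows are hypothetical and not taken from the library.

```python
from collections import Counter


def duplicate_ratio(rows):
    """Fraction of rows that are redundant copies of another row."""
    rows = [tuple(r) for r in rows]
    if not rows:
        return 0.0
    counts = Counter(rows)
    # Every occurrence beyond the first of an identical row is redundant.
    redundant = sum(c - 1 for c in counts.values())
    return redundant / len(rows)


# 148 distinct placeholder rows (not real data) plus the duplicated sample twice.
rows = [(float(i), 0.0, 0.0, 0.0, "placeholder") for i in range(148)]
rows += [(5.8, 2.7, 5.1, 1.9, "Virginica")] * 2

ratio = duplicate_ratio(rows)
# Condition: "Duplicate data ratio is less or equal to 5%"
condition_passed = ratio <= 0.05
print(f"Found {ratio:.2%} duplicate data")
```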
Detects outliers in a dataset using the LoOP algorithm.

| Index | Outlier Probability Score | sepal.length | sepal.width | petal.length | petal.width | variety |
|---|---|---|---|---|---|---|
| 41 | 0.91 | 4.50 | 2.30 | 1.30 | 0.30 | Setosa |
| 106 | 0.69 | 4.90 | 2.50 | 4.50 | 1.70 | Virginica |
| 109 | 0.67 | 7.20 | 3.60 | 6.10 | 2.50 | Virginica |
| 6 | 0.63 | 4.60 | 3.40 | 1.40 | 0.30 | Setosa |
| 59 | 0.61 | 5.20 | 2.70 | 3.90 | 1.40 | Versicolor |
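To give a sense of where such scores come from, here is a simplified pure-Python sketch of LoOP (Local Outlier Probabilities). It is not the check's actual implementation. The function name `loop_scores`, the parameters `k` and `lam`, and the toy 2D points are assumptions chosen for illustration.

```python
from math import dist, erf, sqrt


def loop_scores(points, k=3, lam=3.0):
    """Return a Local Outlier Probability in [0, 1] for each point."""
    n = len(points)
    # k nearest neighbours of each point, excluding the point itself.
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbours.append(others[:k])
    # Probabilistic set distance: lam times the root mean squared neighbour distance.
    pdist = [
        lam * sqrt(sum(dist(points[i], points[j]) ** 2 for j in neighbours[i]) / k)
        for i in range(n)
    ]
    # PLOF compares a point's own density estimate with that of its neighbours.
    # Note: this divides by zero if all neighbours of a point are exact duplicates.
    plof = [
        pdist[i] / (sum(pdist[j] for j in neighbours[i]) / k) - 1.0
        for i in range(n)
    ]
    # Normalisation factor aggregated over the whole dataset.
    nplof = lam * sqrt(sum(v * v for v in plof) / n)
    if nplof == 0:
        return [0.0] * n
    return [max(0.0, erf(v / (nplof * sqrt(2)))) for v in plof]


# Toy data: a tight cluster plus one distant point (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
scores = loop_scores(points, k=3)
most_outlying = max(range(len(points)), key=lambda i: scores[i])
print(most_outlying, [round(s, 2) for s in scores])
```

In this toy set the distant point receives the highest probability, which mirrors how the table above ranks samples by their outlier probability score.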
| Check | Reason |
|---|---|
| Identifier Label Correlation - Train Dataset | DatasetValidationError: Dataset does not contain an index or a datetime. see Dataset docs |
| Single Value in Column | Nothing found |
| Mixed Nulls | Nothing found |
| Special Characters | Nothing found |
| Mixed Data Types | Nothing found |
| String Mismatch | Nothing found |
| String Length Out Of Bounds | Nothing found |
| Conflicting Labels | Nothing found |